LRTDP Versus UCT for Online Probabilistic Planning

Abstract

UCT, the premier method for solving games such as Go, is also becoming the dominant algorithm for probabilistic planning. Out of the five solvers at the International Probabilistic Planning Competition (IPPC) 2011, four were based on the UCT algorithm. However, while a UCT-based planner, PROST, won the contest, an LRTDP-based system, Glutton, came in a close second, outperforming other systems derived from UCT. These results raise a question: what are the strengths and weaknesses of LRTDP and UCT in practice? This paper starts answering this question by contrasting the two approaches in the context of finite-horizon MDPs. We demonstrate that in such scenarios, UCT's lack of a sound termination condition is a serious practical disadvantage. In order to handle an MDP with a large finite horizon under a time constraint, UCT forces an expert to guess a non-myopic lookahead value for which it should be able to converge on the encountered states. Mistakes in setting this parameter can greatly hurt UCT's performance. In contrast, LRTDP's convergence criterion allows for an iterative deepening strategy. Using this strategy, LRTDP automatically finds the largest lookahead value feasible under the given time constraint. As a result, LRTDP has better performance and stronger theoretical properties. We present an online version of Glutton named Gourmand, which illustrates this analysis and outperforms PROST on the set of IPPC-2011 problems.
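The iterative deepening idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: an exact finite-horizon backup stands in for LRTDP's convergence check, and the names `TOY_MDP`, `solve_horizon` and `iterative_deepening` are hypothetical. The sketch only tests the deadline between horizons, whereas a real planner would also have to interrupt a solve that runs over the budget.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy MDP (an assumption, not from the paper):
# mdp[state][action] = list of (probability, next_state, reward).
MDP = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TOY_MDP: MDP = {
    "s0": {
        "safe": [(1.0, "s0", 1.0)],                     # small immediate reward
        "risky": [(0.5, "s1", 0.0), (0.5, "s0", 0.0)],  # pays off only later
    },
    "s1": {"collect": [(1.0, "s0", 5.0)]},
}


def solve_horizon(mdp: MDP, state: str, horizon: int) -> Tuple[str, float]:
    """Exact finite-horizon backup from `state` (horizon >= 1).

    Stand-in for a solver that has converged for this lookahead;
    this is plain dynamic programming, NOT LRTDP.
    """
    memo: Dict[Tuple[str, int], float] = {}

    def value(s: str, h: int) -> float:
        if h == 0 or not mdp.get(s):
            return 0.0  # no steps left, or a state with no actions
        if (s, h) not in memo:
            memo[(s, h)] = max(q(s, a, h) for a in mdp[s])
        return memo[(s, h)]

    def q(s: str, a: str, h: int) -> float:
        return sum(p * (r + value(s2, h - 1)) for p, s2, r in mdp[s][a])

    best = max(mdp[state], key=lambda a: q(state, a, horizon))
    return best, q(state, best, horizon)


def iterative_deepening(
    mdp: MDP,
    state: str,
    max_horizon: int,
    budget_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[str], int]:
    """Solve for lookahead 1, 2, ... and keep the deepest finished result."""
    deadline = clock() + budget_s
    best_action: Optional[str] = None
    reached = 0
    for h in range(1, max_horizon + 1):
        if clock() >= deadline:
            break  # out of time: keep the answer from the previous lookahead
        best_action, _ = solve_horizon(mdp, state, h)
        reached = h
    return best_action, reached
```

In this toy problem a lookahead of 1 prefers `safe`, while a lookahead of 2 or more prefers `risky`, so the action returned depends on how deep the budget lets the loop go. No lookahead value has to be guessed in advance; the loop simply stops at the largest one it completes in time.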

Similar resources

PROST: Probabilistic Planning Based on UCT

We present PROST, a probabilistic planning system that is based on the UCT algorithm by Kocsis and Szepesvári (2006), which has been applied successfully to many areas of planning and acting under uncertainty. The objective of this paper is to show the application of UCT to domain-independent probabilistic planning, an area it had not been applied to before. We furthermore present several enhanc...

Bidirectional Online Probabilistic Planning

We present Bidirectional Online Probabilistic Planner (BOPP), a novel planner that combines elements of Decision Theoretic Planning (DTP) and forward search. In particular, BOPP uses a combination of SPUDD and Upper Confidence Trees (UCT). We present our approach and some experimental results on the domains presented in the Boolean fluents MDP track of the International Probabilistic Planning Co...

Improving UCT planning via approximate homomorphisms

In this paper we show how abstractions can help UCT’s performance. Ideal abstractions are homomorphisms because they preserve optimal policies, but they rarely exist, and are computationally hard to find even when they do. We show how a combination of (i) finding local abstractions in the layered-DAG MDP induced by a set of UCT trajectories (rather than finding abstractions in the global MDP), ...

A Robotic Execution Framework for Online Probabilistic (Re)Planning

Due to the high complexity of probabilistic planning algorithms, roboticists often opt for deterministic replanning paradigms, which can quickly adapt the current plan to the environment’s changes. However, probabilistic planning suffers in practice from the common misconception that it is needed to generate complete or closed policies, which would not require to be adapted on-line. In this wor...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v26i1.8362